Dynamic pathological analysis reveals a protective role against skin fibrosis for TREM2-dependent macrophages

Rationale: Systemic sclerosis (SSc) is a chronic and incurable autoimmune disease with high mortality rates, and skin fibrosis is one of distinguishing hallmarks in the pathogenesis. However, macrophage heterogeneity regulating skin fibrosis remain largely unknown. Methods: We established mouse disease model and performed single-cell RNA-sequencing (scRNA-seq) to resolve the dynamic and heterogenous characteristics of macrophages in skin fibrosis, and the role of TREM2-dependent macrophages in the pathological process was investigated using knockout mice and intraperitoneal transferring TREM2+ macrophages combining with functional assays. Results: We show that TREM2-expressing macrophages (TREM2+ MФs) accumulate in injured skin of mice treated by bleomycin (BLM) and human SSc, and their gene signatures and functional pathways are identified in the course of disease. Genetic ablation of Trem2 in mice globally accelerates and aggravates skin fibrosis, whereas transferring TREM2hi macrophages improves and alleviates skin fibrosis. Amazingly, we found that disease-associated TREM2+ MФs in skin fibrosis exhibit overlapping signatures with fetal skin counterparts in mice and human to maintain skin homeostasis, but each has merits in skin remodeling and development respectively. Conclusion: This study identifies that TREM2 acts as a functional molecule and a major signaling by which macrophage subpopulations play a protective role against fibrosis, and disease-associated TREM2+ MФs in skin fibrosis might undergo a fetal-like reprogramming similar to fetal skin counterparts.


Single-cell suspension preparation and FACS analysis
Mice's skin was harvested and incubated with 0.25% Dispase II (04942078001, Roche) for 1-1.5 h at 37°C after removing fat tissue.The separated epidermal and dermal layers were cut into small pieces and digested by 1 mg/ml collagenase IV (40510ES60, Shanghai Yeasen Biotechnology Co.) in RPMI1640 medium (C11875500BT, HyClone), then incubated at 37°C for 2-2.5 h.The suspension was filtered through 70 μm mesh.After washing, the cells were resuspended in 1XPBS for flow cytometry.Living cells of epidermal or dermal suspension were sorted on a FACSAria III flow cytometer (BD Biosciences) using 4', 6-diamidino-2-phenylindole (DAPI) staining.Then sorted epidermal and dermal cells were mixed with a 1:5 ratio for ScRNA-seq analysis.

Bulk RNA sequencing analysis of skin tissue from mouse skin fibrosis
The quality and concentration of RNA were assessed by Agilent Bioanalyzer 2100 and Nanodrop.
RNA library preparation and sequencing were performed by Novogene (Beijing, China) on the Illumina NovaSeq 6000 platform.Low-quality sequencing reads were filtered using adapter removal if the read length was < 50 bp and the quality value < 3. The remaining reads were then aligned via STAR v2.7.1a [2] using the mouse reference genome (mm10) and GENCODE vM21 annotations.We calculated gene expression levels of all transcripts as fragments per kilobase of exon per million mapped fragments (FPKM) by cufflinks [3].Differentially expressed genes were determined by DESeq2 [4] and identified as Benjamini-Hochberg adjusted P-value < 0.05.Gene ontology (GO) biological process enrichment analysis was performed using DAVID Bioinformatics Resources 6.8 (https://david.ncifcrf.gov/).

Histology, immunohistochemistry, and Immunofluorescence analysis
Dorsal skin samples from injected sites were collected and fixed with 4% paraformaldehyde for 24 h.The samples were dehydrated in ethanol and embedded in paraffin according to standard protocols.

ScRNA-seq library construction and sequencing
Single cells were obtained as gel bead-in-emulsions (GEMs) using the 10X Genomics (Pleasanton, CA, USA) Chromium platform based on the manufacturer's protocol.Briefly, cell suspensions of skin tissue with reverse transcription master mix were loaded onto the 10X Genomics Single Cell 3' chip and partitioned into GEMs with gel beads precoated with oligonucleotides containing a poly-dT primer sequence to capture released mRNAs, a 16-bp 10X barcode for cell indexing, and a 10bp unique molecular identifier (UMI) to distinguish between transcripts.After reverse transcription, barcoded cDNAs for each sample were amplified to generate a sufficient volume for library construction using the Single Cell 3' Reagent Kit (v3 chemistry).Libraries were subjected to 150bp paired-end sequencing performed by Novogene (Beijing, China) on a NovaSeq 6000 system (Illumina, San Diego, CA, USA).

Data preprocessing for single-cell gene expression profiling
The Cell Ranger 4.0.0 pipeline (10X Genomics) was used to align and aggregate all reads from scRNA-seq to the mouse (Mus musculus) reference genome (mm10) [10x Genomics provides prebuilt references for mouse genomes: https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-mm10-2020-A.tar.gz] with default settings and produce a balanced "filtered feature bc matrix".

Cell heterogeneity, clustering, integration, and visualization
The resulting gene-barcode matrix was imported into the SCANPY software (Version 1.7.

1) [5],
where sample information was added.Subsequently, quality control was conducted to ensure that all samples analyzed contained highly consistent quality.First, we applied quality control to cells according to three different metrics step by step, including the total UMI count, the number of detected genes, and the proportion of mitochondrial gene count per cell.Specifically, we filtered cells with less than 2000 UMI count or 200 detected genes, as well as cells with more than 5% mitochondrial gene count.To remove potential doublets, we also removed cells with the number of detected genes above 6,000.Furthermore, we removed potential doublets predicted by Scrublet [6].Next, after quality control, we applied the library-size correction method to normalize the raw count matrix by using normalize_total function in SCANPY.We employed the workflow of SCANPY to perform dimension reduction and graph-based unsupervised clustering on scRNA-seq data.Briefly, we first selected 2000 highly variable genes (HVGs) for downstream analysis by using scanpy.pp.highly_variable_genes function with parameter "n_top_genes=2000".Then, effects of the total count per cell and the percentage of mitochondrial gene count were regressed out by using scanpy.pp.regress_out function.LIGER (Version 2.0.1)software [7] was used to integrate the different samples and variation in the cells' transcriptional profile was visualized by uniform manifold approximation and projection for dimension reduction (UMAP) function in SCANPY, and the initial dim number for PCA analysis was set to 30.PCA components with standard deviation > 1.25 (generated by ElbowPlot) and JackStraw score < 0.05 (generated by Seurat R package [8] ScoreJackStraw function) were used to generate two-dimensional UMAP.Leiden algorithm [9] was used to identify clusters within cell populations (Leiden clustering resolution = 0.5, n_pcs=30).Next, each sample Scanpy object file was converted to Seurat object file with SeuratDisk R package (https://github.com/mojaveazure/seurat-disk).Differential expressed genes were calculated in Seurat R package using the FindMarkers function, which performed a two-sided Wilcoxon rank sum test restricted to genes expressed in at least 25% of cells in either of the two populations compared, and with a natural log fold change cut-off of 0.25.All p-values were adjusted for multiple testing using the Benjamini-Hochberg method.
To compare with public mouse fetal skin and human SSc scRNA-seq data, public data were downloaded and analysed according the workflow of Seurat R package.Integration analysis between public scRNA-seq data and our mouse scRNA-seq data were performed within R v3.6.2 and with Seurat v 3.1.1[10].Briefly, the 2000 anchors were used to integrate all Seurat objects created above: (i) by finding the integration anchors (Seurat::FindIntegrationAnchors), (ii) by integrating the objects (Seurat::IntegrateData), (iii) using all.genes to scale the data and regress out the nCount_RNA, and (iv) calculating the top 30 PCAs and using them for dimensional reduction and identifying cell clusters by using a resolution of 0.5 and 1.To identify the differentially expressed genes, we used the Seurat::FindMarkers function and compared every period to its corresponding control for every cell type.For visualization of the scRNAseq data the visualization methods in Seurat and ggplot2 R package were implemented.For gene set activate analysis in scRNAseq data, AUCell method from AUCell Bioconductor packages were used to calculate AUCell score [11].

Cell-Cell communication analysis of mouse scRNA-seq data
To investigate cell-cell communication in mouse scRNA-seq data, we inferred cell-cell interactions using the NATMI method [12].NATMI takes a slightly different approach, focusing on learning interactions between cells using bulk or single-cell expression data, not addressing the specifics of how these regulatory interactions impact downstream gene expression.NATMI uses large-scale ligand-receptor databases to create prior knowledge networks.It then calculates a weight for each interaction between genes, based on three expression-based metrics from the dataset of interest, which are used to determine cell type interactions.Meanwhile, NATMI accept mouse scRNA-seq data as input to analyse cell-cell interactions.Briefly, Seurat normalized expression values were converted to CPMs and then grouped by cell subtype and sample prior to NATMI analysis (https://github.com/asrhou/NATMI).The python 3 versions of NATMI ExtractEdges with the suggested dependency versions were run on each sample with default settings.The predicted ligandreceptor interactions were then read into R for further analysis.For a cell-cell connection to be kept for further analysis, the ligand and receptor must be expressed in >10 cells.Connections were also filtered out if receptor or ligand detection rate < 0.2.Cluster autocrine-signaling and interactions where the ligand and receptor were the same genes were removed for data presentation and interpretation.Furthermore, interactions that were not seen in at least two samples were removed.

Gene Set Enrichment Analysis
To investigate biological pathways correlated with the course of skin fibrosis, we performed a GSEA analysis [13].The needed cells from each sample were aggregated to create pseudo-bulk counts, then ClusterProfile R package use the pseudo-bulk counts to do GSEA.The reference gene sets of GSEA, which was called as "c2.cp.kegg.v7.4.symbols.gmt,and c5.go.bp.v7.4.symbols.gmt"were downloaded from the Molecular Signatures Database.The threshold was set at P < 0.05 and false discovery rate (FDR) q < 0.1.Function and pathway enrichment analysis of differential expressed genes were also performed with ClusterProfile.

Public scRNA-seq data processing
Single-cell RNA-seq counts were obtained from the GEO database as the raw expression matrix of unique molecular identifier counts (UMI counts; indicative of the number of unique RNA molecules detected) for each gene in each cell group.UMI counts from each dataset were filtered on the following criteria: cells with > 1000 UMI counts, > 200 features, < 7500 features, and < 25% mitochondrial gene expression, then normalized and log-transformed using the LogNormalize function in the Seurat package.We integrated all the datasets using the 'scTransform' package for batch correction and scRNA-seq data integration.It was then followed by a combined principal component analysis (PCA), computing 50 principal components, of which the first 30 (based on an elbow plot) were used to compute the Uniform Manifold Approximation and Projection (UMAP) analysis for dimensionality reduction.For the differential expression analysis between Trem2 positive cells and Trem2 negative cells, the cutoff values log (Fold Change) and P-value were set to 0.3 and 0.05, respectively.

Consistence with publicly available mouse skin single-cell data
In order to demonstrate the consistency of our cell identity assignments with previous efforts in mouse skin single-cell data, we acquired publicly available mouse skin single-cell data(GSE129218) and retained the cell identity definitions used in the prior study.Combining cells of the same type into a pseudo-bulk, we calculated the correlation between cell type pseudo-bulks from our dataset and the publicly available data.The results indicate that our cell identity definitions align consistently with the definitions from the previous study.

Genetic similarity of TREM2 + macrophages between mouse fetal skin and BLM-treated skin
To demonstrate the genetic similarity between TREM2 + macrophages in mouse fetal skin and TREM2 + macrophages in mouse systemic sclerosis (SSc), we employed two distinct datasets of mouse fetal skin single-cell data (GSE122043, GSE131498).We integrated the TREM2 + and TREM2 -macrophages from both datasets with the TREM2 + and TREM2 -macrophages in our dataset.Additionally, we combined cells from the above six different source into individual pseudobulks.Subsequently, based on the Euclidean distance metric, we performed hierarchical clustering with "hclust" method in R package stat.The outcomes of this analysis revealed that, compared to TREM2 -macrophages, TREM2 + macrophages in both mouse fetal skin and mouse SSc exhibit a higher degree of genetic similarity.
In order to demonstrate that the occurrence of an intersection of 24 genes in the overlap of three upregulated gene sets is not a result of random chance, we employed a permutation test.Specifically, within each respective dataset of macrophages scRNA-Seq, we randomly selected the same number of expressed genes as the upregulated genes in Trem2 + macrophages.Subsequently, we calculated the number of genes present in the intersection of the three selected gene sets.This process was repeated 1000 times, and the frequency of occurrences where the intersection size exceeded 24 was computed.The resulting empirical p-value was determined to be 0.0076.As a result, we conclude that the observed overlap of 24 genes among upregulated genes in Trem2 + macrophages from the mouse fetal skin dataset (GSE122043), the mouse skin fibrosis dataset (our data), and the Trem2 + macrophages dataset (GSE131498) is not a product of random chance.

Culture of bone marrow derived macrophage (BMDMs) in vitro
According to previously published protocols, BM was isolated and induced into macrophages in vitro [14].Briefly, femurs and tibias were removed from WT or TREM2 KO mice after being sacrificed.The bones were immersed in 75% ethanol for 5 min, then put in PBS.BM was collected by flushing with 10 ml PBS.Red blood cells were lysed for 3 min with 1X commercial lysis buffer (00-4300-54, eBioscience).After neutralizing, the cells were filtered through a 70-μm cell strainer (BD Biosciences; Franklin Lakes, NJ, USA; 352350).Cells were washed and resuspended in 10 ml of DMEM medium containing 10% fetal bovine serum, 100 U/ml penicillin, 100 μg/ml streptomycin, and 10 ng/ml murine granulocyte-macrophage colony-stimulating factor (GM-CSF, 315-03, PeproTech).On d 4, 10 ml of complete medium was added to the cultures.On d 7, the cells were stimulated with 1 nM 1α, 25-dihydroxycholecalciferol (D1530, Sigma-Aldrich) for 24 h.Cells were harvested for BMDM injection in vivo.

Enzyme-linked immunosorbent assay (ELISA)
50mg skin tissue was collected and rinsed with 1×PBS, homogenized in 500 uL of 1×PBS and stored overnight at -20°C.After two freeze-thaw cycles were performed to break the cell membranes, the homogenates were centrifuged for 5 minutes at 5000×g (RCF), 4°C.The hydroxyproline (HYP) in the supernatant was measured using an ELISA kit (CSB-E08839m, CUSABIO), according to the manufacturer's instructions.